image.png

WeRateDogs is a Twitter account that rates people's dogs with a humorous comment about each dog. These ratings almost always have a denominator of 10. The numerators, though? Almost always greater than 10: 11/10, 12/10, 13/10, etc. Why? Because "they're good dogs Brent." WeRateDogs has over 4 million followers and has received international media coverage. A set of data was gathered about some of the dogs on the WeRateDogs page. The data includes:

  1. Tweet_id
  2. Image of the dog
  3. Name of the dog
  4. Breed prediction
  5. Stage of the dog (puppo, doggo, floofer or pupper)
  6. Rating numerator and denominator
  7. Favorite count, and
  8. Retweet count.

This information was analyzed, and I'll be sharing some interesting outcomes of my analysis.

Do you think there is any correlation between favorite_count, retweet_count and rating_numerator? I would have expected any dog with a high rating to have a high retweet and/or favorite count. This, however, was not the case. Below is a graphical representation of the relationship between these variables:

In [1]:
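import pandas as pd
import matplotlib.pyplot as plt
import plotly.express as px
# IPython magic: render matplotlib figures inline in the notebook
%matplotlib inline
# Load the cleaned WeRateDogs data set
df = pd.read_csv('twitter_archive_master.csv')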
In [2]:
# Scatter plot of retweet_count vs. favorite_count with an OLS trendline
# (trendline="ols" requires the statsmodels package to be installed)
fig = px.scatter(df, x="retweet_count", y="favorite_count", color="tweet_id", trendline="ols")
fig.show()

There is a strong correlation between these two values (retweet_count and favorite_count), with a correlation coefficient of 0.86.
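The scatter plot shows the trend, but the coefficient itself can be computed directly with pandas. The following is a minimal sketch, assuming the Pearson coefficient is the one quoted above; `pearson_r` is a hypothetical helper name, and the small DataFrame is toy data for illustration only, not values from twitter_archive_master.csv.

```python
import pandas as pd


def pearson_r(df: pd.DataFrame, x: str, y: str) -> float:
    """Return the Pearson correlation coefficient between two columns."""
    # Series.corr uses method="pearson" by default
    return df[x].corr(df[y])


# Toy data (illustrative only): favorite_count is an exact linear
# function of retweet_count, so the coefficient is 1
toy = pd.DataFrame({
    "retweet_count": [10, 20, 30, 40],
    "favorite_count": [25, 45, 65, 85],
})

r = pearson_r(toy, "retweet_count", "favorite_count")
print(round(r, 2))
```

With the real data loaded as `df`, the same call would be `pearson_r(df, "retweet_count", "favorite_count")`.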

Now, here's rating_numerator against retweet_count:

In [3]:
# Scatter plot of rating_numerator vs. retweet_count with an OLS trendline
fig = px.scatter(df, x="rating_numerator", y="retweet_count", color="tweet_id", trendline="ols")
fig.show()

There is a very weak correlation between rating_numerator and retweet_count, with a correlation coefficient of 0.08. Since favorite_count and retweet_count are strongly correlated, the correlation between rating_numerator and favorite_count is likely to be weak as well.
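Correlation is not strictly transitive, so rather than inferring the relationship between rating_numerator and favorite_count, it may be safer to compute all three pairwise coefficients at once. This is a small sketch using `DataFrame.corr`; `pairwise_corr` is a hypothetical helper, and the values are toy data for illustration only.

```python
import pandas as pd

COLS = ["rating_numerator", "retweet_count", "favorite_count"]


def pairwise_corr(df: pd.DataFrame) -> pd.DataFrame:
    """Return the Pearson correlation matrix for the three columns."""
    return df[COLS].corr()


# Toy data (illustrative only, not taken from the real data set)
toy = pd.DataFrame({
    "rating_numerator": [10, 11, 12, 13, 12],
    "retweet_count": [5, 1, 4, 2, 3],
    "favorite_count": [50, 12, 41, 19, 33],
})

corr = pairwise_corr(toy)
print(corr.round(2))
```

Each cell of the resulting matrix is the coefficient for one pair of columns, so the rating_numerator and favorite_count entry can be read off directly instead of being implied.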

A further review was done to find the dogs with the highest and lowest favorite counts.

The dog with the highest favorite count (156,628) is a Labrador_retriever, which also has a high rating numerator of 13 against a denominator of 10.

image.png

The dog with the lowest favorite count (72) is an English_setter. Even so, this dog has a high rating numerator of 11 against a denominator of 10. This is consistent with the weak correlation between rating_numerator and favorite_count.

image.png
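The rows with the highest and lowest favorite counts can be looked up with `idxmax` and `idxmin`. This is a sketch under assumptions: the column name `breed` is hypothetical (the real file may name the breed prediction column differently), `extremes` is an illustrative helper, and the rows are toy data rather than actual tweets.

```python
import pandas as pd


def extremes(df: pd.DataFrame, column: str = "favorite_count"):
    """Return the rows holding the maximum and minimum of `column`."""
    top = df.loc[df[column].idxmax()]
    bottom = df.loc[df[column].idxmin()]
    return top, bottom


# Toy data (illustrative only); "breed" is an assumed column name
toy = pd.DataFrame({
    "breed": ["breed_a", "breed_b", "breed_c"],
    "rating_numerator": [13, 11, 12],
    "favorite_count": [300, 20, 150],
})

top, bottom = extremes(toy)
print(top["breed"], top["favorite_count"])
print(bottom["breed"], bottom["favorite_count"])
```

Two extreme rows are only anecdotal evidence, so this lookup complements the correlation coefficient rather than replacing it.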

The highest rating numerator in the data set is 14, and here's the list of dog breeds that made the 14 list:

Pembroke, Samoyed, French_bulldog, Chihuahua, black-and-tan_coonhound, bloodhound, golden_retriever, Bedlington_terrier, Rottweiler, Pomeranian, Irish_setter, Gordon_setter, standard_poodle, French_bulldog, golden_retriever, Eskimo_dog
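A list like this can be produced by filtering for the maximum rating numerator. The sketch below assumes a hypothetical `breed` column and uses toy rows for illustration; `breeds_at_max_rating` is not a name from the original notebook.

```python
import pandas as pd


def breeds_at_max_rating(df: pd.DataFrame) -> list:
    """Return the breed of every row whose rating_numerator is the maximum."""
    max_rating = df["rating_numerator"].max()
    mask = df["rating_numerator"] == max_rating
    # One entry per matching row, so a breed can appear more than once
    return df.loc[mask, "breed"].tolist()


# Toy data (illustrative only); "breed" is an assumed column name
toy = pd.DataFrame({
    "breed": ["breed_a", "breed_b", "breed_c", "breed_a"],
    "rating_numerator": [14, 12, 14, 14],
})

top_breeds = breeds_at_max_rating(toy)
print(top_breeds)
```

Because the filter returns one entry per row, a breed rated 14 in several tweets shows up several times; wrapping the result in `sorted(set(...))` would give the distinct breeds instead.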

That's all for the summary of my analysis.
